# Multimodal Control

Spatialvla 4b 224 Sft Fractal

SpatialVLA is a vision-language-action model fine-tuned on the fractal dataset, primarily used for robot control tasks.

Transformers English

Openvla 7b Oft Finetuned Libero Object

OpenVLA-OFT is a vision-language-action model built on OpenVLA with the Optimized Fine-Tuning (OFT) recipe, which improves inference speed and task success rate. This checkpoint is fine-tuned on the LIBERO-Object task suite.

Multimodal Fusion

Stable Diffusion 3.5 Large Controlnet Blur

A blur ControlNet for the Stable Diffusion 3.5 Large model, used to generate images conditioned on a blurred input image.

Image Generation English

HiCo is a hierarchical controllable diffusion model specifically designed for layout-to-image generation tasks.

Image Generation English

CSGO is a PyTorch implementation for text-to-image generation, supporting image-driven style transfer, text-driven stylized synthesis, and text-editing-driven stylized synthesis.

Image Generation English
